Abstract

We investigated distributions of short-term price trends in high-frequency stock
market data. The number of trends was measured as a function of their length. We
found that this distribution does not fit the results expected from an
uncorrelated stochastic process. We propose a simple model with memory that
gives qualitative agreement with real data.
1 Introduction



Statistical analysis of stock prices is a rich source of information about the
nature of financial markets. Louis Bachelier was the first to use a stochastic
approach to model financial time series [1]. Since then, the statistical
analysis of stock prices has become a widely investigated area of
interdisciplinary research [2,3,4,5].

In 1973, Fischer Black and Myron Scholes published their famous work [6],
in which they presented a model for pricing European options. They assumed
that the price of an asset can be described by a geometric Brownian motion.
However, the behaviour of real markets differs from the Brownian property
[7,8], since the price returns form a truncated Levy distribution [9,10,11]. As a
result of this observation many non-Gaussian models were introduced [2,3,4].
Another divergence from Gaussian behaviour is autocorrelation in financial
systems. Empirical studies show that the autocorrelation function of stock
market time series decays exponentially with a characteristic time of a few
minutes, while the autocorrelation of the absolute values decays more slowly,
as a power law, which leads to volatility clustering [12,13,14,15].




The issue of market memory has also been considered by many authors (see
references in [2,3]). It was observed [16] that, for certain time scales, a
sequence of two positive price changes leads more frequently to a subsequent
positive change than a sequence of mixed changes, i.e. the conditional
probability $P(+|++)$ is larger than $P(+|+-)$. In this paper we investigate
this effect for high-frequency stock market data.
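As an illustration of how such conditional probabilities can be estimated from
a sequence of signs of price changes, a minimal sketch is given below. The
function name `conditional_up_probability` and the toy sequence are
hypothetical and are not taken from the analysed data.

```python
from collections import Counter

def conditional_up_probability(signs, history):
    """Estimate P(next = +1 | preceding signs == history) from a +1/-1 sequence."""
    h = len(history)
    history = tuple(history)
    counts = Counter()
    for i in range(h, len(signs)):
        # Look at the h signs directly preceding position i.
        if tuple(signs[i - h:i]) == history:
            counts[signs[i]] += 1
    total = counts[1] + counts[-1]
    return counts[1] / total if total else float("nan")

# Toy sequence of signs (hypothetical, not market data).
signs = [1, 1, 1, -1, 1, 1, 1, 1, -1, -1, 1, -1, 1, 1, -1]
p_after_up_up = conditional_up_probability(signs, (1, 1))
p_after_up_down = conditional_up_probability(signs, (1, -1))
print(p_after_up_up, p_after_up_down)
```

On real data, the two estimates would be compared for a given sampling
interval; the toy sequence above is far too short to say anything about memory.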



2 Empirical data 

Let us consider short-term price trends for high-frequency stock market data.
By a short-term uptrend/downtrend we mean a sequence of prices in which each
price is larger/smaller than the preceding one (see below for a more precise
definition).

First, given a time series $Y_t$, which in our case is the history of a stock
price or a market index, we build a series of variables $s_t$ in the following
way:

• $s_t = 1$ if $Y_t > Y_{t-1}$,

• $s_t = -1$ if $Y_t < Y_{t-1}$,

• $s_t = s_{t-1}$ if $Y_t = Y_{t-1}$.

A positive value of the variable $s_t$ means that at time $t$ the price $Y_t$
did not decrease, and similarly a negative value means that the price did not
increase.

In the series $s_t$ we can distinguish subseries of identical values. For
$a < b$ and $s_a = s_b = s$, $S(a, b, s)$ is such a subseries if and only if
$s_c = s$ for all $c \in (a, b)$. A subseries $S(a, b, s)$ can be identified
with an uptrend lasting from $t = a$ till $t = b$ for $s = 1$, and with a
downtrend for $s = -1$. The length $l$ of such an uptrend/downtrend is equal to
$b - a + 1$. Note that a subseries of length $l$ includes two subseries of
length $l - 1$, three subseries of length $l - 2$, etc.

Let $N(l)$ be the number of subseries of length $l$ with a fixed $s$ in a series
$s_1, \ldots, s_M$. If $s_t$ were generated by an uncorrelated discrete
stochastic process with probability $P(s_t = 1) = p$, then the expected value of
$N(l)$ would be equal to:

$$N(l) = (M - l + 1)\,p^{l}, \tag{1}$$

where $M$ is the number of all elements in the basic series. Similarly, the
expected number of downtrend series of length $l$ is
$N(l) = (M - l + 1)(1 - p)^{l}$.

Fig. 1. Distribution $N(l)$ for uptrends (a) and downtrends (b) for the WIG20
index between the 13th June 2003 and the 3rd November 2006. Data are sampled
every 15 seconds (circles) and are compared to the uncorrelated process of
eq. (1) (line).
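The construction of the sign series and the counting of $N(l)$ can be sketched
as follows. This is only an illustration: the function names and the toy price
list are hypothetical, and leading steps with no price change are skipped,
which is an assumption, since the sign is undefined before the first change.

```python
def sign_series(prices):
    """Build s_t: +1 if the price rose, -1 if it fell, previous sign if unchanged."""
    signs = []
    current = 0  # undefined until the first actual price change
    for y_prev, y in zip(prices, prices[1:]):
        if y > y_prev:
            current = 1
        elif y < y_prev:
            current = -1
        if current != 0:  # assumption: skip leading flat steps
            signs.append(current)
    return signs

def count_subseries(signs, s, max_len):
    """Return [N(1), ..., N(max_len)]: windows of length l whose elements all equal s."""
    counts = [0] * (max_len + 1)
    run = 0
    for value in signs:
        run = run + 1 if value == s else 0
        # A run of current length `run` ends one window of each length 1..run here.
        for l in range(1, min(run, max_len) + 1):
            counts[l] += 1
    return counts[1:]

def expected_uncorrelated(M, p, max_len):
    """Expected counts (M - l + 1) * p**l for an uncorrelated process."""
    return [(M - l + 1) * p ** l for l in range(1, max_len + 1)]

# Toy price history (hypothetical, not market data).
prices = [10.0, 10.1, 10.2, 10.2, 10.1, 10.3, 10.4, 10.5, 10.4]
signs = sign_series(prices)
n_up = count_subseries(signs, 1, 4)
print(signs, n_up, expected_uncorrelated(len(signs), 0.5, 4))
```

Counting every window, and not only maximal runs, reproduces the property
noted above that a subseries of length $l$ contains two subseries of length
$l - 1$.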



We have measured the distribution $N(l)$ for real market data and for the
corresponding uncorrelated process. Figure 1 presents both distributions for
the WIG20 index of the Warsaw Stock Exchange (WSE) between the 13th June 2003
and the 3rd November 2006. The results show a significant difference between
real data and the uncorrelated model. If the variables $s_t$ were uncorrelated,
there would be no subseries longer than 25 ticks. In fact, subseries even
longer than 100 ticks are present; such trends last for about 30 minutes. There
are far more such trends than there would be if the process were uncorrelated.
The distribution $N(l)$ was also calculated for particular stocks from WSE,
NYSE and NASDAQ (fig. 2). The stocks from WSE were Bioton, between the 31st
March 2005 and the 3rd November 2006, and TPSA, between the 17th November 2000
and the 3rd November 2006. The stock from NYSE was Apple, and the stock from
NASDAQ was Intel, both between the 4th January 1999 and the 29th December 2000.
For the WIG20 index trend periods were measured in real time, but for the
stocks they were measured in transaction time (see section 4).



The observed difference between the uncorrelated model and the real markets is
due to strong autocorrelations in the process $s_t$. It is seen only in
high-frequency data. Choosing every $n$-th element of the series $s_t$ weakens
the autocorrelations, so that the outcome approaches the uncorrelated model
with growing $n$. This is shown in fig. 3.
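The effect of taking every $n$-th element can be illustrated on a synthetic
persistent series. The sketch below is not based on market data: the
persistence probability 0.9, the series length and the function names are
assumptions chosen only to show the weakening of the lag-one correlation.

```python
import random

def lag1_autocorrelation(signs):
    """Mean of s_t * s_{t+1}; close to 0 for an uncorrelated symmetric series."""
    pairs = list(zip(signs, signs[1:]))
    return sum(a * b for a, b in pairs) / len(pairs)

def subsample(signs, n):
    """Keep every n-th element of the series."""
    return signs[::n]

# Synthetic persistent series (hypothetical): each sign repeats the
# previous one with probability 0.9 and flips otherwise.
rng = random.Random(0)
signs = [1]
for _ in range(20000):
    signs.append(signs[-1] if rng.random() < 0.9 else -signs[-1])

for n in (1, 2, 4, 6):
    print(n, round(lag1_autocorrelation(subsample(signs, n)), 3))
```

The same subsampling can be applied before counting $N(l)$ to reproduce the
kind of comparison shown in fig. 3.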
Fig. 2. Distribution $N(l)$ for: (a) BIOTON between the 31st March 2005 and the
3rd November 2006 (WSE), (b) TPSA between the 17th November 2000 and the
3rd November 2006 (WSE), (c) APPLE between the 4th January 1999 and the 29th
December 2000 (NYSE), (d) INTEL between the 4th January 1999 and the 29th
December 2000 (NASDAQ). Uptrends are plotted with circles and downtrends are
plotted with squares. All data are sampled tick by tick. Lines correspond to the
uncorrelated process (1).
Fig. 3. $N(l)$ of WIG20 uptrends for every $n$-th element of $s_t$: $n=1$
circles, $n=2$ squares, $n=4$ diamonds, $n=6$ triangles.
3 A phenomenological model of correlated market prices

In real markets the variables $s_t$ and $s_{t+\tau}$ are correlated, although
these correlations decay very fast. Let $r(k)$ stand for the conditional
probability $P(s_{n+k+1} = 1 \mid s_{n+k} = 1, \ldots, s_n = 1, s_{n-1} = -1)$,
which is independent of $n$. For processes where autocorrelations are present
we can write a generalization of equation (1):

$$N(l + 1) = (M - l)\,p \prod_{i=1}^{l} r(i), \tag{2}$$

for $l > 0$, and $N(1) = Mp$.

Note that the result (2) is equivalent to (1) if $r(x) = p$ for any $x$. A
key issue is to model $r(x)$ so as to describe the characteristics of a given
market. The results presented in figs. 1 and 2 show that the model of the
uncorrelated process (1) is a poor simplification. To get better consistency
with real data, the function $r(x)$ can be modelled as:

$$r(x) = a(x - x_1)(x - x_2), \tag{3}$$

with fitted parameters $a$, $x_1$ and $x_2$. A quadratic function was chosen
because we are looking for a simple concave function with a maximum, and a
quadratic function meets these requirements for suitable parameters $a$, $x_1$,
$x_2$.
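As a numerical illustration of the recursion (2) with the quadratic form (3),
the sketch below evaluates $N(l)$ for the WIG20 uptrend parameters quoted in
this paper. The values of $M$ and $p$ are hypothetical, chosen only for the
example, and the function names are not from the original analysis.

```python
def r_quadratic(x, a, x1, x2):
    """Continuation probability r(x) = a (x - x1)(x - x2)."""
    return a * (x - x1) * (x - x2)

def n_correlated(M, p, r, max_len):
    """Return [N(1), ..., N(max_len)] with N(1) = M p and
    N(l + 1) = (M - l) p prod_{i=1..l} r(i)."""
    values = [M * p]
    product = 1.0
    for l in range(1, max_len):
        product *= r(l)
        values.append((M - l) * p * product)
    return values

M, p = 100_000, 0.5                    # hypothetical series length and up-probability
a, x1, x2 = -0.000098, -48.28, 153.31  # WIG20 uptrend fit quoted in the paper

model = n_correlated(M, p, lambda x: r_quadratic(x, a, x1, x2), 100)
flat = n_correlated(M, p, lambda x: p, 100)  # r(x) = p gives back eq. (1)
print(model[:3], flat[:3])
```

With a constant $r(x) = p$ the recursion returns $(M - l + 1)p^l$, which is
the equivalence between (2) and (1) stated above.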

We expect that for small $x$ the probability $r(x)$ increases with $x$. This
means that when a trend starts forming, investors follow it and, as a result,
they amplify the trend. Thus, the probability of a continuation of the price
movement grows. As time goes by, some of them may want to withdraw to
take profits, and those who are out of the market believe it is too late to get
in. This causes a decrease of $r(x)$ for a longer trend.

One can choose various functions to model the probability $r(x)$. All such
functions should be concave, with a maximum at a positive argument smaller
than the maximal length of all subseries. Figure 4 presents the distribution
$N(l)$ with a fitted curve based on (3).

Putting (3) into (2) we get, after some algebra, an approximate form of the
function $N(l)$:

$$N(l) \approx (M - l + 1)\,p\,e^{-2(l-1)}\,[a(l - x_1)(l - x_2)]^{l-1}
\left(\frac{l - x_1}{1 - x_1}\right)^{1 - x_1}
\left(\frac{l - x_2}{1 - x_2}\right)^{1 - x_2}. \tag{4}$$
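The algebra leading to (4) is not spelled out here. One possible route, given
only as a sketch, assumes that the sum of logarithms in (2) is replaced by an
integral over $[1, l]$, with $a < 0$ and $x_1 < 1 \le l < x_2$ as for the
fitted parameters:

```latex
\begin{align*}
\ln \prod_{i=1}^{l-1} r(i)
  &= \sum_{i=1}^{l-1} \left[ \ln|a| + \ln(i - x_1) + \ln(x_2 - i) \right] \\
  &\approx (l-1)\ln|a| + \int_1^l \ln(x - x_1)\,dx + \int_1^l \ln(x_2 - x)\,dx \\
  &= (l-1)\ln|a| - 2(l-1)
     + (l - x_1)\ln(l - x_1) - (1 - x_1)\ln(1 - x_1) \\
  &\quad + (x_2 - 1)\ln(x_2 - 1) - (x_2 - l)\ln(x_2 - l), \\[4pt]
\frac{(l - x_1)^{l - x_1}}{(1 - x_1)^{1 - x_1}}
  &= (l - x_1)^{l-1} \left( \frac{l - x_1}{1 - x_1} \right)^{1 - x_1}, \qquad
\frac{(x_2 - 1)^{x_2 - 1}}{(x_2 - l)^{x_2 - l}}
  = (x_2 - l)^{l-1} \left( \frac{l - x_2}{1 - x_2} \right)^{1 - x_2}.
\end{align*}
```

Exponentiating the first result, regrouping the powers as in the last line,
using $|a|(l - x_1)(x_2 - l) = a(l - x_1)(l - x_2)$ and multiplying by
$(M - l + 1)p$ gives the form (4).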

Figure 5 presents the functions (2) and (4) and the relative difference between
them.
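The comparison between the exact product (2) and the approximation (4) can be
sketched numerically as below. The calculation is done in logarithms to avoid
overflow; $M$ and $p$ are hypothetical values, the fit parameters are those
quoted for WIG20 uptrends, and the function names are introduced only for this
example.

```python
import math

def log_n_exact(l, M, p, a, x1, x2):
    """log of N(l) = (M - l + 1) p prod_{i=1..l-1} a (i - x1)(i - x2), eq. (2)."""
    total = math.log(M - l + 1) + math.log(p)
    for i in range(1, l):
        total += math.log(a * (i - x1) * (i - x2))
    return total

def log_n_approx(l, M, p, a, x1, x2):
    """log of the approximate N(l) of eq. (4); valid for x1 < 1 <= l < x2, a < 0."""
    return (math.log(M - l + 1) + math.log(p) - 2 * (l - 1)
            + (l - 1) * math.log(a * (l - x1) * (l - x2))
            + (1 - x1) * math.log((l - x1) / (1 - x1))
            + (1 - x2) * math.log((l - x2) / (1 - x2)))

def relative_difference(l, M, p, a, x1, x2):
    """d = (N_2 - N_1) / N_2 with N_1 from eq. (2) and N_2 from eq. (4)."""
    return 1.0 - math.exp(log_n_exact(l, M, p, a, x1, x2)
                          - log_n_approx(l, M, p, a, x1, x2))

M, p = 100_000, 0.5                    # hypothetical values
a, x1, x2 = -0.000098, -48.28, 153.31  # WIG20 uptrend fit quoted in the paper
d = [relative_difference(l, M, p, a, x1, x2) for l in range(1, 101)]
print(d[0], d[49])
```

Since $M$ and $p$ enter both expressions in the same way, the relative
difference $d$ depends only on the fit parameters and on $l$.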
Fig. 4. The distribution $N(l)$ with fitted curve (2): (a) WIG20 (WSE) uptrends
with fitted parameters $a = -0.000098$, $x_1 = -48.28$, $x_2 = 153.31$, (b) TP
S.A. company (WSE) uptrends with fitted parameters $a = -0.000057$,
$x_1 = -73.42$, $x_2 = 184.35$.
Fig. 5. Graph (a) presents the distribution $N_1(l)$ obtained from equation (2)
and $N_2(l)$ from the approximation (4) for the WIG20 index (WSE). Graph (b)
presents the relative difference between them:
$d = (N_2(l) - N_1(l))/N_2(l)$.



For a given set of parameters $a^+ = -0.000098$, $x_1^+ = -48.28$,
$x_2^+ = 153.31$, $a^- = -0.000101$, $x_1^- = -47.32$, $x_2^- = 150.69$,
obtained for uptrends and downtrends in the WIG20 index respectively, one can
simulate the stochastic process according to eqs. (2) and (3). The
autocorrelation function

$$C(\tau) = \langle s_{i+\tau}\, s_i \rangle, \tag{5}$$

calculated for such a process decreases similarly to the autocorrelation
function obtained from empirical data (see fig. 6). The model reflects
short-range correlations of the sign.
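A simulation of this kind can be sketched as follows. The update rule is an
assumption: at each step the current trend continues with probability $r(k)$,
where $k$ is the number of steps since the trend began, and otherwise the sign
flips. Values of $r$ outside $[0, 1]$ are clipped, which is also an assumption
and matters only for trends approaching $x_2$. The function names, the series
length and the seed are hypothetical.

```python
import random

def r_quadratic(k, a, x1, x2):
    """r(k) = a (k - x1)(k - x2), clipped to [0, 1] (clipping is an assumption)."""
    return min(max(a * (k - x1) * (k - x2), 0.0), 1.0)

def simulate_signs(n_steps, params_up, params_down, seed=0):
    """Generate a +1/-1 series whose trends continue with probability r(k)."""
    rng = random.Random(seed)
    sign, k = 1, 0  # k counts steps since the current trend began
    signs = []
    for _ in range(n_steps):
        signs.append(sign)
        params = params_up if sign == 1 else params_down
        if rng.random() < r_quadratic(k, *params):
            k += 1                 # trend continues
        else:
            sign, k = -sign, 0     # trend reverses
    return signs

def autocorrelation(signs, tau):
    """C(tau) = average of s_{i+tau} * s_i over the series, eq. (5)."""
    n = len(signs) - tau
    return sum(signs[i] * signs[i + tau] for i in range(n)) / n

params_up = (-0.000098, -48.28, 153.31)    # a+, x1+, x2+ quoted in the paper
params_down = (-0.000101, -47.32, 150.69)  # a-, x1-, x2- quoted in the paper
signs = simulate_signs(50_000, params_up, params_down)
c = {tau: autocorrelation(signs, tau) for tau in (0, 1, 5, 20, 50)}
print(c)
```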
Fig. 6. The autocorrelation function obtained from eq. (5) for parameters
$a^+ = -0.000098$, $x_1^+ = -48.28$, $x_2^+ = 153.31$, $a^- = -0.000101$,
$x_1^- = -47.32$, $x_2^- = 150.69$ (line) and for the WIG20 index itself
(circles).



4 Measuring trends in volume and volatility times 

In the previous sections we presented an analysis of price trends measured in
real and transaction times. The WIG20 index is published every 15 seconds, so
the 15-second data are the most frequent available for this index. Investigating
these data we naturally used real time with a 15-second interval, in which
the sequence "+1, +1, +1, +1" means that the index did not decrease during
one minute.

For the stocks of companies, tick-by-tick data are available, so transaction
time is a natural measure of time length. Transaction time can be defined
as:

$$\tau_t(t_i) = \tau_t(t_{i-1}) + 1, \tag{6}$$

where $t_i$ is the real time of transaction $i$. We used the times $\tau_t$ for
the data analysis of single stocks. In this case the sequence
"+1, +1, +1, +1" means that the price did not decrease during four subsequent
transactions.

Other time definitions are also possible. One of them is the volume time [17],
defined in a standard way as:

$$\tau_v(t_i) = \tau_v(t_{i-1}) + V_i, \tag{7}$$

where $V_i$ is the volume of transaction $i$.
Fig. 7. The distribution $L(\tau_v)$ of numbers of trend periods of length
$\tau_v$ measured in volume time, for: (a) PKN Orlen (WSE), (b) TPSA (WSE).
Data were binned with a bin of width 100.

Fig. 8. The distribution $N(\tau_v)$ for: (a) PKN Orlen (WSE), (b) TPSA (WSE).

We repeated our analysis, measuring the lengths of trend periods in volume
time instead of transaction time. Figure 7 presents the distributions
$L(\tau_v)$ of trend periods of exact length $\tau_v$ for the stocks PKN Orlen
and TPSA of the Warsaw Stock Exchange, both for the period between the 17th
November 2000 and the 3rd November 2006. Let us stress that, contrary to the
distribution $N(l)$, which was in a sense cumulative, $L(\tau_v)$ is
non-cumulative, because it shows only trends of the exact length $\tau_v$.
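A non-cumulative distribution of this kind can be sketched as below. The
sketch assumes that the volume-time length of a trend period is the sum of the
volumes of its transactions and that periods are maximal runs of equal signs;
both are assumptions, and the toy signs and volumes are hypothetical.

```python
from collections import Counter
from itertools import groupby

def trend_lengths_in_volume_time(signs, volumes):
    """Return (sign, volume-time length) for each maximal run of equal signs."""
    lengths = []
    index = 0
    for sign, group in groupby(signs):
        run = len(list(group))
        # Assumption: length of a trend = total volume traded during the run.
        lengths.append((sign, sum(volumes[index:index + run])))
        index += run
    return lengths

def binned_distribution(lengths, sign, bin_width=100):
    """Count trend periods of the given sign per bin of volume-time length."""
    bins = Counter()
    for s, length in lengths:
        if s == sign:
            bins[(length // bin_width) * bin_width] += 1
    return dict(sorted(bins.items()))

# Toy transaction data (hypothetical): sign of each price change and its volume.
signs = [1, 1, 1, -1, -1, 1, 1, -1]
volumes = [50, 120, 30, 200, 10, 40, 40, 300]
lengths = trend_lengths_in_volume_time(signs, volumes)
up_hist = binned_distribution(lengths, 1)
print(lengths, up_hist)
```

Each key of the resulting dictionary is the lower edge of a bin of width 100,
matching the binning used for fig. 7.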

The distribution $N(\tau_v)$ is presented in fig. 8. One can see that there is
an inflection point within the range of the variable $\tau_v$. It resembles the
shape of the function $N(l)$ from figs. 1 and 2, which also possesses an
inflection point.

By analogy to the volume time (7) we can define the volatility time:

$$\tau_\sigma(t_i) = \tau_\sigma(t_{i-1}) + \sigma(t_i), \tag{8}$$

where $\sigma(t_i)$ is the local volatility at the time $t_i$.
Fig. 9. The binned distribution $L(\tau_\sigma)$, i.e. the number of trend
periods of length $\tau_\sigma$ measured in volatility time, for: (a) PKN Orlen
(WSE), (b) TPSA (WSE).



The volatility at time $t_i$ was defined as the absolute value of the
log-return for the transaction at time $t_i$, that is

$$\sigma(t_i) = \left|\log\left(P(t_i)/P(t_{i-1})\right)\right|, \tag{9}$$

where $P(t_i)$ is the price of the stock at time $t_i$. For volatility defined
in this way we measured trend lengths and presented the distribution of the
lengths $L(\tau_\sigma)$ in fig. 9.
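The different clocks can be computed side by side as cumulative sums of
per-transaction increments. The sketch below assumes that every transaction
advances transaction time by one and volume time by its volume, and it uses
eq. (9) for the volatility increment; the price and volume lists and the
function names are hypothetical.

```python
import math
from itertools import accumulate

def volatility(prices):
    """sigma(t_i) = |log(P(t_i) / P(t_{i-1}))| for each transaction i >= 1."""
    return [abs(math.log(p / q)) for q, p in zip(prices, prices[1:])]

def cumulative_time(increments):
    """tau(t_i) = tau(t_{i-1}) + increment_i, starting from tau = 0."""
    return list(accumulate(increments))

# Toy data (hypothetical): prices at transactions 0..4, volumes of transactions 1..4.
prices = [100.0, 101.0, 101.0, 99.0, 100.0]
volumes = [10, 25, 5, 40]

tau_t = cumulative_time([1] * (len(prices) - 1))  # transaction time (assumed unit steps)
tau_v = cumulative_time(volumes)                  # volume time
tau_sigma = cumulative_time(volatility(prices))   # volatility time, eqs. (8)-(9)
print(tau_t, tau_v, tau_sigma)
```

A transaction that leaves the price unchanged does not advance the volatility
clock, so trend lengths measured in this time weight only the steps on which
the price actually moved.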



5 Conclusions 

We have investigated short-term price trends for high-frequency stock market
data. It turned out that the statistics for real markets are significantly
different from the statistics of uncorrelated processes. Longer trends (of the
order of several minutes) are much more frequent than an uncorrelated model
would predict.

The investigations have been repeated for trends measured in volume and
volatility time. The distribution of trends in volume time, $N(\tau_v)$,
behaves similarly to the function $N(l)$.

We proposed a simple model that qualitatively captures the behaviour of the
market. The model leads to a distribution of trend series $N(l)$ that is similar
to the distribution observed in market data. Our model also produces
short-range correlations. This behaviour is caused by the conditional
probability of trend continuation, which changes nonmonotonically with the
trend length. At the beginning of a trend the probability of its continuation
grows, then it reaches a maximum and finally decreases. As a result, trends
possess limited lengths.